We still need to find real data for this model; the training calls below use placeholder array names.

  • The model uses keras.layers.concatenate to join two parts of the model.
  • There are two inputs and two outputs.

Let's consider the following model. We seek to predict how many retweets and likes a news headline will receive on Twitter.

  • The main input to the model will be the headline itself, as a sequence of words.
  • To spice things up, the model will also have an auxiliary input, receiving extra data such as the time of day when the headline was posted.

The model will also be supervised via two loss functions: using the main loss function earlier in a model is a good regularization mechanism for deep models.

The main input will receive the headline as a sequence of integers (each integer encodes a word). The integers will range from 1 to 9,999 (indices into a 10,000-word vocabulary; Keras's Embedding layer expects indices strictly below its input_dim) and the sequences will be 100 words long.
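No dataset accompanies this notebook, so for a quick smoke test we can fabricate random arrays with the right shapes. The names `headline_data`, `additional_data` and `labels` match the `fit()` calls at the end of the notebook; all values here are random placeholders, not real Twitter data:

```python
import numpy as np

# Hypothetical placeholder data -- shapes match the model, values are random.
num_samples = 1000

# 100 word indices per headline, each in [1, 10000).
headline_data = np.random.randint(1, 10000, size=(num_samples, 100))

# Five auxiliary features per sample (e.g. time of day, encoded numerically).
additional_data = np.random.randn(num_samples, 5)

# Binary targets, shared by the main and auxiliary outputs.
labels = np.random.randint(0, 2, size=(num_samples, 1))
```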


In [7]:
from keras.layers import Input, Embedding, LSTM, Dense, concatenate
from keras.models import Model

In [2]:
# Headline input: meant to receive sequences of 100 integers, each in [1, 10000).
# Note that we can name any layer by passing it a "name" argument.
main_input = Input(shape=(100,), dtype='int32', name='main_input')

In [3]:
# This embedding layer will encode the input sequence
# into a sequence of dense 512-dimensional vectors.
x = Embedding(output_dim=512, input_dim=10000, input_length=100)(main_input)

In [4]:
# An LSTM will transform the vector sequence into a single vector,
# containing information about the entire sequence.
lstm_out = LSTM(32)(x)
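A quick sanity check on the parameter counts these two layers contribute (the same figures appear in the `model.summary()` output below): the embedding is simply a lookup table of input_dim × output_dim weights, while an LSTM has four gates, each with a weight matrix over the concatenated [input; hidden] vector plus a bias:

```python
# Embedding: one 512-dim row per vocabulary entry.
vocab_size, embedding_dim = 10000, 512
embedding_params = vocab_size * embedding_dim  # 5,120,000

# LSTM: four gates (input, forget, cell, output), each mapping
# [input; hidden] -> hidden, plus a bias per gate.
units = 32
lstm_params = 4 * ((embedding_dim + units) * units + units)  # 69,760

print(embedding_params, lstm_params)
```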

Here we insert the auxiliary loss, allowing the LSTM and Embedding layers to be trained smoothly even though the main loss is computed much later (higher up) in the model.


In [5]:
auxiliary_output = Dense(1, activation='sigmoid', name='aux_output')(lstm_out)

In [8]:
auxiliary_input = Input(shape=(5,), name='aux_input')
x = concatenate([lstm_out, auxiliary_input])
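`concatenate` joins its inputs along the last axis, so the 32-dimensional LSTM output and the 5 auxiliary features become one 37-dimensional feature vector per sample. The same shape arithmetic with plain NumPy stand-ins:

```python
import numpy as np

# Stand-ins with the shapes of lstm_out and auxiliary_input (batch of 4).
lstm_like = np.zeros((4, 32))
aux_like = np.zeros((4, 5))

# Joining along the last axis gives 32 + 5 = 37 features per sample.
merged = np.concatenate([lstm_like, aux_like], axis=-1)
print(merged.shape)  # (4, 37)
```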

In [9]:
# We stack a deep densely-connected network on top
x = Dense(64, activation='relu')(x)
x = Dense(64, activation='relu')(x)
x = Dense(64, activation='relu')(x)

In [10]:
# And finally we add the main logistic regression layer
main_output = Dense(1, activation='sigmoid', name='main_output')(x)

This defines a model with two inputs and two outputs:


In [11]:
model = Model(inputs=[main_input, auxiliary_input], outputs=[main_output, auxiliary_output])

In [12]:
model.summary()


____________________________________________________________________________________________________
Layer (type)                     Output Shape          Param #     Connected to                     
====================================================================================================
main_input (InputLayer)          (None, 100)           0                                            
____________________________________________________________________________________________________
embedding_1 (Embedding)          (None, 100, 512)      5120000                                      
____________________________________________________________________________________________________
lstm_1 (LSTM)                    (None, 32)            69760                                        
____________________________________________________________________________________________________
aux_input (InputLayer)           (None, 5)             0                                            
____________________________________________________________________________________________________
concatenate_1 (Concatenate)      (None, 37)            0                                            
____________________________________________________________________________________________________
dense_1 (Dense)                  (None, 64)            2432                                         
____________________________________________________________________________________________________
dense_2 (Dense)                  (None, 64)            4160                                         
____________________________________________________________________________________________________
dense_3 (Dense)                  (None, 64)            4160                                         
____________________________________________________________________________________________________
main_output (Dense)              (None, 1)             65                                           
____________________________________________________________________________________________________
aux_output (Dense)               (None, 1)             33                                           
====================================================================================================
Total params: 5,200,610.0
Trainable params: 5,200,610.0
Non-trainable params: 0.0
____________________________________________________________________________________________________

We compile the model and assign a weight of 0.2 to the auxiliary loss. To specify a different loss or loss_weights value for each output, you can use a list or a dictionary. Here we pass a single loss as the loss argument, so the same loss will be used on all outputs.


In [13]:
model.compile(optimizer='rmsprop', loss='binary_crossentropy',
              loss_weights=[1., 0.2])
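With `loss_weights=[1., 0.2]`, the quantity actually minimized during training is the weighted sum of the two per-output losses. A minimal sketch of that combination (the loss values below are hypothetical, not produced by the model):

```python
def combined_loss(main_loss, aux_loss, weights=(1.0, 0.2)):
    """Weighted sum of per-output losses, as Keras combines them."""
    return weights[0] * main_loss + weights[1] * aux_loss

# Hypothetical per-output binary cross-entropy values:
print(combined_loss(0.693, 0.693))  # 0.693 + 0.2 * 0.693
```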

We can train the model by passing it lists of input arrays and target arrays:


In [ ]:
model.fit([headline_data, additional_data], [labels, labels],
          epochs=50, batch_size=32)

Since we named our inputs and outputs, we could also refer to them by name:


In [ ]:
model.compile(optimizer='rmsprop',
              loss={'main_output': 'binary_crossentropy', 'aux_output': 'binary_crossentropy'},
              loss_weights={'main_output': 1., 'aux_output': 0.2})

# And train it via:
model.fit({'main_input': headline_data, 'aux_input': additional_data},
          {'main_output': labels, 'aux_output': labels},
          epochs=50, batch_size=32)